Real-time multi-pattern detection over event streams

ABSTRACT

A system comprising: at least one hardware processor; and a non-transitory computer-readable storage medium having stored thereon program instructions, the program instructions executable by the at least one hardware processor to: receive a data stream representing events; receive a plurality of complex event patterns (CEPs) comprising (a) a set of conditions reflecting relations among said events, and (b) a set of attributes associated with each of said events; and calculate an optimal multi-pattern evaluation plan corresponding to said CEPs by: (i) generating an initial evaluation plan, (ii) applying a search method to calculate modified versions of said initial evaluation plan, (iii) assigning a score to each of said modified versions based on a cost function, and (iv) selecting one of said modified versions having a highest said score as said optimal multi-pattern evaluation plan.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a National Phase of PCT Patent Application No. PCT/IL2020/050018 having International filing date of Jan. 7, 2020, which claims the benefit of priority of U.S. Provisional Patent Application No. 62/789,017, filed Jan. 7, 2019, the contents of which are all incorporated herein by reference in their entirety.

BACKGROUND

This invention relates to the field of computerized complex event processing.

Rapid advances in data-driven applications over recent years have intensified the need for efficient mechanisms capable of monitoring and detecting arbitrarily complex patterns in massive data streams. This task is usually performed by complex event processing (CEP) systems. CEP engines are required to process hundreds or even thousands of user-defined patterns in parallel under tight real-time constraints. To enhance the performance of this crucial operation, multiple techniques have been developed, utilizing well-known optimization approaches such as pattern rewriting and sharing common subexpressions. However, the scalability of these methods is limited by the high computation overhead, and the quality of the produced plans is compromised by ignoring significant parts of the solution space.

The foregoing examples of the related art and limitations related therewith are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent to those of skill in the art upon a reading of the specification and a study of the figures.

SUMMARY

The following embodiments and aspects thereof are described and illustrated in conjunction with systems, tools and methods which are meant to be exemplary and illustrative, not limiting in scope.

There is provided, in an embodiment, a system comprising: at least one hardware processor; and a non-transitory computer-readable storage medium having stored thereon program instructions, the program instructions executable by the at least one hardware processor to: receive, as input, a data stream representing events; receive, as input, a plurality of complex event patterns (CEPs), each representing an occurrence of a respective CEP in said data stream, wherein each of said CEPs comprises (a) a set of conditions reflecting relations among said events, and (b) a set of attributes associated with each of said events; and calculate an optimal multi-pattern evaluation plan corresponding to said plurality of CEPs, wherein said multi-pattern evaluation plan is created by: (i) generating an initial evaluation plan, (ii) applying a search method to calculate modified versions of said initial evaluation plan, (iii) assigning a score to each of said modified versions based on a cost function, and (iv) selecting one of said modified versions having a highest said score as said optimal multi-pattern evaluation plan.

There is also provided, in an embodiment, a method comprising: receiving, as input, a data stream representing events; receiving, as input, a plurality of complex event patterns (CEPs), each representing an occurrence of a respective CEP in said data stream, wherein each of said CEPs comprises (a) a set of conditions reflecting relations among said events, and (b) a set of attributes associated with each of said events; and calculating an optimal multi-pattern evaluation plan corresponding to said plurality of CEPs, wherein said multi-pattern evaluation plan is created by: (i) generating an initial evaluation plan, (ii) applying a search method to calculate modified versions of said initial evaluation plan, (iii) assigning a score to each of said modified versions based on a cost function, and (iv) selecting one of said modified versions having a highest said score as said optimal multi-pattern evaluation plan.

There is further provided, in an embodiment, a computer program product comprising a non-transitory computer-readable storage medium having program instructions embodied therewith, the program instructions executable by at least one hardware processor to: receive, as input, a data stream representing events; receive, as input, a plurality of complex event patterns (CEPs), each representing an occurrence of a respective CEP in said data stream, wherein each of said CEPs comprises (a) a set of conditions reflecting relations among said events, and (b) a set of attributes associated with each of said events; and calculate an optimal multi-pattern evaluation plan corresponding to said plurality of CEPs, wherein said multi-pattern evaluation plan is created by: (i) generating an initial evaluation plan, (ii) applying a search method to calculate modified versions of said initial evaluation plan, (iii) assigning a score to each of said modified versions based on a cost function, and (iv) selecting one of said modified versions having a highest said score as said optimal multi-pattern evaluation plan.

In some embodiments, the search is based, at least in part, on (i) reordering of said events in each of said CEPs to maximize common sub-patterns among said CEPs; and (ii) sharing of said common sub-patterns among all of said CEPs.

In some embodiments, the cost function minimizes a number of estimated intermediate results during an execution of said modified version.

In some embodiments, steps (ii) and (iii) are repeated iteratively based on one of: a specified time limit, and a specified number of iterations.

In some embodiments, the CEPs are based on user definition.

In some embodiments, the program instructions are further executable to execute, and the method further comprises executing, said multi-pattern evaluation plan on said data stream, to generate output data.

In addition to the exemplary aspects and embodiments described above, further aspects and embodiments will become apparent by reference to the figures and by study of the following detailed description.

BRIEF DESCRIPTION OF THE FIGURES

Exemplary embodiments are illustrated in referenced figures. Dimensions of components and features shown in the figures are generally chosen for convenience and clarity of presentation and are not necessarily shown to scale. The figures are listed below.

FIGS. 1A-1B show evaluation mechanisms for a sequence of events using NFA with and without reordering;

FIGS. 2A-2B show an NFA sharing example for event sequences;

FIGS. 3A-3D show an NFA optimization example for event sequences with no sharing or reordering, with reordering and without sharing, with sharing and without reordering, and with both sharing and reordering;

FIG. 4 is a schematic structure of an exemplary MCEP system, in accordance with some embodiments of the present invention;

FIGS. 5A-5C show multi-pattern trees for a workload of two patterns using different evaluation orders;

FIGS. 6A-6C show MPT modification examples following the addition or removal of a local evaluation plan, in accordance with some embodiments of the present invention;

FIG. 7 shows a multi-pattern graph for a workload of 6 patterns, in accordance with some embodiments of the present invention;

FIGS. 8A-8C show exemplary tree-based plans for a pattern, in accordance with some embodiments of the present invention;

FIG. 9 shows a multi-pattern multitree for a shared workload of patterns, in accordance with some embodiments of the present invention; and

FIGS. 10A-14D show experimental results, in accordance with some embodiments of the present invention.

DETAILED DESCRIPTION

Disclosed herein are a system, method and computer program product for real-time multi-pattern complex event processing (Multi-pattern CEP or MCEP).

In some embodiments, the present disclosure provides for optimizing MCEP performance using a combination of sharing and pattern reordering techniques. In some embodiments, the present disclosure presents an optimization framework for solving this computationally hard problem under tight real-time conditions. In some embodiments, experimental evaluation of the present disclosure demonstrated a significant performance improvement as compared to known techniques.

In some embodiments, the present disclosure is based on formulating the MCEP task as a global optimization problem, and applying a combination of sharing and pattern reordering techniques to construct an optimal plan satisfying the problem constraints.

In some embodiments, the present disclosure provides for locating a best possible evaluation plan in a hyper-exponential solution space, using efficient local search algorithms that utilize the unique problem structure.

Complex event processing (CEP) methods are widely employed in applications where arbitrarily complex combinations (patterns) of data items must be promptly and efficiently detected in massive data streams. Examples of such areas include financial services, electronic health record systems, sensor networks, and Internet-of-Things (IoT) applications.

CEP systems treat data items as events arriving from event sources. As new events are detected, they are combined into higher-level complex events matching the user-specified patterns.

Modern CEP engines are typically required to support efficient simultaneous tracking of hundreds to thousands of patterns in multiple high-speed input streams of events. Systems possessing this functionality may be referred to as multi-pattern complex event processing (MCEP) systems.

With reference to FIGS. 1A-1B, consider the following scenario: A system for managing an array of smart security cameras A, B, C is installed in a building. All cameras are equipped with face recognition software, and periodical readings from each camera are sent in real time to the main server. A detection objective is a scenario in which an intruder accesses the restricted area via the main gate of the building rather than from the dedicated entrance. This pattern can be represented as a sequence of three primitive events:

-   camera A (installed near the main gate) detects a person;
-   later, camera B (located inside the building's lobby) detects the same person;
-   finally, camera C detects the same person in the restricted area.

The system is concerned with detecting a scenario in which an intruder is detected near doorway A, then immediately passes through entrance B, and finally enters doorway C. This pattern can be formulated as a sequence of three events, each corresponding to getting a signal from sensors A, B, and C. A real-life MCEP system could define multiple ‘abnormal’ paths inside the building and specify a dedicated pattern for each path.

Pattern matches in known CEP systems are detected using an evaluation mechanism. One of the most prominent evaluation mechanisms is the non-deterministic finite automaton (NFA). FIGS. 1A-1B present an example of an NFA for detecting the sequence A→B→C of sensor signals. A state is defined for each prefix of a valid match. Every ‘accepting’ transition between states is associated with some event type. The detection is triggered by the arrival of a signal from sensor A. For each accepted signal, the stream of events from sensor B is probed. If a new signal is subsequently received from B, a corresponding event from sensor C is then checked.

During evaluation, an NFA keeps track of partial matches, that is, already detected subsets of a potential pattern match. A newly arrived event is combined with all currently stored partial matches corresponding to the state accepting this event. For instance, an event of type C will be matched with pairs of As and Bs. Accordingly, the known MCEP architecture leads to worst-case exponential (in terms of pattern size) processing time and memory consumption.

Thus, it would be advantageous to maximize pattern detection performance in MCEP systems.

Attempts to make MCEP more efficient have targeted various possibilities for creating efficient evaluation mechanisms. Two of the most popular optimization strategies are pattern rewriting and pattern sharing.

Pattern rewriting methods exploit the statistical properties of the event data to replace the evaluation mechanism with an equivalent yet more efficient one. Pattern reordering is a more specific technique within this category, focused on modifying the order in which the events are processed. For example, if sensor C generates significantly fewer signals than A and B do, then instead of following the order A→B→C specified by the pattern, it would be beneficial to first wait for a signal from C, then examine the local history for previous signals received from sensors B and A. This way, fewer partial matches would be created, resulting in better memory utilization and faster processing of incoming events. FIG. 1B depicts an NFA constructed according to this improved plan.

Pattern sharing methods utilize the structural similarities between different patterns to unify the processing of common subexpressions. FIGS. 2A-2B illustrate this principle. For presentational purposes, ‘ignore’ edges and ‘accept’ labels are omitted. The system monitors a pair of patterns P₁: A→B→C and P₂: A→B→D. Instead of processing these patterns independently (as in FIG. 2A), the system can merge the first three states of the respective NFAs to produce a joint automaton (FIG. 2B). This optimization avoids duplicate instantiation and storage of partial matches.

Pattern reordering and pattern sharing are generally considered orthogonal techniques that cover different aspects of CEP performance optimization. This also implies that each of the two methods overlooks certain opportunities exploited by the other.

Accordingly, a fusion of both approaches could discover evaluation plans that would not be considered otherwise. This may be illustrated with reference to FIGS. 3A-3D, which show the following arrangements:

-   FIG. 3A: NFA optimization with no sharing or reordering;
-   FIG. 3B: NFA optimization with reordering and no sharing;
-   FIG. 3C: NFA optimization with sharing and no reordering; and
-   FIG. 3D: NFA optimization with both sharing and reordering.

Reordering the patterns in FIG. 3A by the ascending order of event arrival rates might result in a pair of locally optimal NFAs (FIG. 3B). Alternatively, a global shared plan shown in FIG. 3C can be obtained by sharing the first two states. Now consider a combined application of the above techniques, where the NFAs are first reordered to maximize the common prefix length, and then this newly created sub-pattern is shared. FIG. 3D shows the resulting plan. This plan would never be created if only one of the two optimizations were employed, or if they were used independently.

Accordingly, in some embodiments, the present disclosure provides for a novel framework for large-scale MCEP. Rather than merely maximizing the sharing degree or creating locally optimal plans, the present disclosure provides for a globally optimal plan for the given workload of patterns, using a combination of both sharing and reordering. In some embodiments, the present disclosure provides for an MCEP optimizer that uses sharing and reordering techniques to generate candidate evaluation plans. This fusion permits taking advantage of sub-expressions not normally considered for sharing. To traverse the hyper-exponential space of plans, the present disclosure further incorporates a method based on the local search paradigm. As opposed to known MCEP optimizers, the present disclosure can operate under arbitrarily tight time constraints due to the inherent balance between optimization time and solution quality.

A potential advantage of the present disclosure is, therefore, that it provides a novel approach for optimizing large-scale MCEP systems by combining the power of state-of-the-art pattern sharing and reordering techniques. In some embodiments, the present disclosure also provides for a set of algorithms for efficiently searching the solution space. The present algorithms are highly precise and their execution time can be arbitrarily limited. In some embodiments, an MCEP engine may then be implemented utilizing the plans created by the present optimizer for efficient pattern detection.
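The generate, modify, score and select loop described above can be illustrated with a minimal hill-climbing sketch. It is not the disclosed optimizer: the function names (`plan_cost`, `neighbor`, `local_search`), the swap-based move, the simplified cost (unit time window, no conditions) and the seeded random generator are all assumptions made for illustration. A lower cost is treated here as a better score, and the loop stops on either an iteration budget or a time limit.

```python
import random
import time

def plan_cost(orders, rates):
    """Hypothetical simplified cost: sum, over distinct non-empty prefixes of
    the evaluation orders, of the product of the arrival rates in the prefix.
    Shared prefixes are counted once, so sharing lowers the cost."""
    prefixes = {tuple(o[:k]) for o in orders for k in range(1, len(o) + 1)}
    total = 0.0
    for prefix in prefixes:
        cost = 1.0
        for etype in prefix:
            cost *= rates[etype]
        total += cost
    return total

def neighbor(orders, rng):
    """Modified version of a plan: swap two positions in one order
    (every order is assumed to contain at least two event types)."""
    orders = list(orders)
    p = rng.randrange(len(orders))
    order = list(orders[p])
    i, j = rng.sample(range(len(order)), 2)
    order[i], order[j] = order[j], order[i]
    orders[p] = tuple(order)
    return tuple(orders)

def local_search(initial, rates, max_iters=1000, time_limit=1.0, seed=0):
    """Keep the best plan seen; stop after max_iters or time_limit seconds."""
    rng = random.Random(seed)
    best = tuple(tuple(o) for o in initial)
    best_cost = plan_cost(best, rates)
    deadline = time.monotonic() + time_limit
    for _ in range(max_iters):
        if time.monotonic() > deadline:
            break
        candidate = neighbor(best, rng)
        candidate_cost = plan_cost(candidate, rates)
        if candidate_cost < best_cost:   # lower cost = better score
            best, best_cost = candidate, candidate_cost
    return best, best_cost

rates = {"A": 1.0, "B": 10.0, "C": 100.0, "D": 100.0}
initial = (("C", "B", "A"), ("D", "B", "A"))
best, best_cost = local_search(initial, rates)
```

Strict hill climbing of this kind can stall in a local optimum, because a single swap may break a shared prefix before the reordering pays off; this is one reason a practical search may also accept some non-improving moves.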

Background and Terminology

Formally, an MCEP system accepts three parameters: an input data stream I, a pattern workload WL, and a statistics collection Stat. The input stream I={e₁, e₂, . . . } is an ordered, possibly infinite temporal sequence of primitive events, or simply events. I is defined as a “logical” input source, possibly encapsulating multiple merged substreams. Each event e_(i) is represented by a well-defined type and a set of attributes, including the occurrence timestamp. In the example from FIGS. 1A-1B, the event type is specified by the origin sensor ID, and the attribute set may include the movement speed of an intruder or the direction of passing.

The workload WL={P₁, . . . , P_(n)} contains a finite number of patterns the system is requested to detect. Each pattern is defined by the tuple P_(i)=(ℰ_(i), S_(i), C_(i), W_(i)), where ℰ_(i)={E₁, . . . , E_(m_(i))} is the set of event types participating in P_(i), S_(i) denotes the structure of P_(i) (which will be defined shortly), C_(i) is the condition set specifying the constraints on the attribute values of the events, and W_(i) is the time window defined for this pattern, that is, the maximal allowed time difference between the timestamps of a pair of events in a match.

The structure S_(i) specifies how the events requested by the pattern are to be assembled to form a match. It is defined by a combination of event types and operators. In this disclosure, the most common operators such as AND, SEQ, and OR will be considered. The AND operator requires the occurrence of all events specified in the pattern. The SEQ operator also expects the events to appear in a predefined temporal order. The OR operator corresponds to the appearance of any event out of those specified. Two additional important operators are the negation (NOT), requiring the absence of an event from some position in the match, and the Kleene closure (KL), accepting one or more instances of an event.

To illustrate the above, the structure of the pattern from FIGS. 1A-1B could be summarized as SEQ(A, B, C), with ℰ={A, B, C}. If the order of receiving the signals was not important, the pattern would be formulated as AND(A, B, C). In addition, assume that a signal arriving from the sensor D indicates the arrival of a security guard to the area, in which case no alarm should be set. Then, the structure of the pattern would become SEQ(AND(A, B, C), NOT(D)).

In the general case, S_(i) is an arbitrary expression over the above operators. Such patterns can be simplified by the transition to DNF form. From the standpoint of an MCEP system, every clause of the resulting DNF expression can be considered as a separate pattern in a workload. In addition, a clause containing multiple AND/SEQ operators can be flattened to a simple expression featuring a single AND or SEQ with possible NOT and KL applied on single events. Therefore, only patterns of this simplified form will be considered herein.

Stat is a set of statistical data properties that are used by the MCEP engine during evaluation plan generation. In the example above, Stat contains the arrival rates of all event types (that is, of signals from each sensor). In addition, the selectivities of the conditions defined by the patterns will be considered as members in Stat. The selectivity of a condition is defined as the probability of the input tuple to successfully pass the condition. More formally,

$Stat = \{ r_x \mid \exists P_i \in WL : E_x \in \mathcal{E}_i \} \cup \{ sel_{x,y}^{C_i} \mid \exists P_i \in WL : E_x, E_y \in \mathcal{E}_i \},$

where r_(x) is the arrival rate of the event type E_(x), and sel_(x,y)^(C)∈[0,1] is the selectivity of a mutual condition between E_(x) and E_(y) in some condition set C (where sel_(x,y)^(C)=1 is set if no condition is defined between the event types). The results can be trivially extended to additional parameters, such as inter-event dependencies and costs of predicate evaluation, by modifying the cost model (see below).

FIG. 4 schematically illustrates an exemplary architecture of an MCEP system, according to an embodiment. The evaluation mechanism 402 is responsible for the actual processing of the input stream I. The evaluation mechanism of choice in FIGS. 1-3 is an NFA. Various works describe different variations of NFAs. In this disclosure, the ‘lazy NFA’ variety will be used exclusively (see, e.g., I. Kolchinsky, I. Sharfman, and A. Schuster. Lazy evaluation methods for detecting complex events. In DEBS, pages 34-45. ACM, 2015).

A lazy NFA (FIG. 1B) can be configured to follow any execution order regardless of the actual order requested by the pattern. Since NFAs generally are only capable of tracking a single pattern, an extension for multiple patterns will be presented below.

At runtime, the evaluation mechanism follows an evaluation plan 404 supplied by the optimizer 406. A distinction is drawn between local evaluation plans applicable for single-pattern evaluation mechanisms only, and global evaluation plans that consider a workload of patterns. For example, the plans applied by the NFAs in FIGS. 1A-1B are local evaluation plans, whereas FIGS. 2 and 3 illustrate global evaluation plans.

Different evaluation mechanisms support different types of evaluation plans. Creating a lazy chain-structured NFA (FIG. 1) for a single pattern requires an order-based local evaluation plan. For a pattern P over the event types E₁, . . . , E_(m), the order-based evaluation plan is an ordering O=(E_(q₁), . . . , E_(q_(m))), where q₁, . . . , q_(m) is a permutation of [1, . . . , m]. Any pattern using the operators defined above (with the exception of OR) can be detected by such an NFA.

The task of the optimizer 406 is to create a global evaluation plan upon system initialization. The resulting plan is then transferred to the evaluation mechanism 402, which subsequently launches the detection process on a stream I. The optimizer 406 typically uses a predefined cost function to measure the quality of a plan subject to the given workload WL and the statistics collection Stat. This function is defined as Cost: ℙ × 𝕎 × STAT → ℝ, where ℙ, 𝕎, STAT are the sets of all global evaluation plans, workloads, and statistics collections, respectively. The cost assigned by this function may reflect performance metrics such as throughput, detection latency, communication cost, and more.

The present analysis below assumes the values in Stat to be constant and known in advance. However, in real-life scenarios this information is rarely obtained in advance and is subject to rapid fluctuations over time. To overcome this problem, the present disclosure employs standard adaptivity mechanisms, continuously estimating the up-to-date statistics and relaunching the optimizer when a significant change is detected.

Multi-Pattern CEP with Prefix Sharing

This section presents the core principles and algorithms behind the present MCEP system. For presentational purposes, a limited version of the present method is described, only considering prefix sharing opportunities between patterns. This description is further extended below to support arbitrary sub-expression sharing.

Multi-Pattern NFA Evaluation

In some embodiments, the present disclosure processes all patterns in a workload using a single NFA, which is denoted as the multi-pattern NFA. It is organized in a tree-like topology formed by merging the common prefixes of the chain-structured NFAs corresponding to each pattern in the workload. The root of the tree is shared between all patterns and serves as the initial state of the automaton. Each internal node can be shared between two or more patterns.

Because different patterns may have different time windows, each state of the multi-pattern NFA is augmented with a special time window attribute, set to the largest time window among the patterns sharing the state. The system uses this attribute to decide whether a partial match has expired.

FIGS. 5A-5C depict three of the possible multi-pattern NFAs for a workload of two patterns, P₁: SEQ(A, B, C) and P₂: SEQ(A, B, D), with W₁=10 and W₂=20, where FIG. 5A depicts evaluation orders A,B,C and A,B,D (maximal sharing); FIG. 5B depicts evaluation orders B,C,A and B,A,D; and FIG. 5C depicts evaluation orders C,B,A and A,D,B (minimal sharing). As discussed above, some NFAs have more shared states, while others contain more states in total but provide more efficient evaluation paths for individual patterns.

For each pattern in a workload, a dedicated final state is defined. When the final state corresponding to some pattern is reached, a match is reported. Note that while final states are typically the leaves of the tree, this is not always the case. For example, in a workload consisting of SEQ(A, B, C) and SEQ(A, B), the final state for SEQ(A, B) is an internal node.

The evaluation process for multiple patterns is similar to the one presented in (Kolchinsky [2015]) for single-pattern detection. As a new event e of type T enters the system, it is evaluated against existing NFA instances. An instance is defined by a combination of a unique state identifier and a partial match. The system starts with a single instance associated with the initial state and an empty match. All instances associated with states containing an outgoing transition for T are matched with e. For every instance satisfying the conditions between the events (including e), a new instance is created containing the new match resulting from e's addition and associated with the state to which the transition leads. When an instance corresponding to some final state is created, its match is reported to the end users. An instance exceeding the time window specified by its associated state is removed from the system.
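The instance-based evaluation loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed engine: evaluation orders coincide with the SEQ order of each pattern (so no lazy buffering is needed), no attribute conditions are checked, events are hypothetical `(type, timestamp)` pairs, and states are identified by the prefix of event types consumed so far. The names `build_nfa` and `detect` are illustrative.

```python
from collections import defaultdict

def build_nfa(patterns):
    """patterns: {name: (order, window)}; states are the prefixes consumed so far."""
    transitions = defaultdict(dict)   # state -> {event type: next state}
    finals = defaultdict(list)        # state -> names of patterns completed there
    windows = defaultdict(float)      # state -> largest window among sharing patterns
    for name, (order, window) in patterns.items():
        state = ()
        windows[state] = max(windows[state], window)
        for etype in order:
            nxt = state + (etype,)
            transitions[state][etype] = nxt
            windows[nxt] = max(windows[nxt], window)
            state = nxt
        finals[state].append(name)
    return transitions, finals, windows

def detect(patterns, stream):
    """stream: iterable of (event type, timestamp) pairs in timestamp order."""
    transitions, finals, windows = build_nfa(patterns)
    instances = defaultdict(list)     # state -> partial matches (lists of events)
    instances[()].append([])          # single initial instance with an empty match
    reported = []
    for event in stream:
        etype, ts = event
        # Remove instances that exceeded the time window of their state.
        for state in list(instances):
            if state:                 # the initial instance never expires
                instances[state] = [m for m in instances[state]
                                    if ts - m[0][1] <= windows[state]]
        # Collect extensions first so the new event is not matched with itself.
        created = []
        for state, outgoing in transitions.items():
            if etype in outgoing:
                for match in instances.get(state, []):
                    created.append((outgoing[etype], match + [event]))
        for state, match in created:
            instances[state].append(match)
            for name in finals.get(state, []):
                # A shared state may carry a larger window than this pattern's own.
                if ts - match[0][1] <= patterns[name][1]:
                    reported.append((name, tuple(match)))
    return reported

patterns = {"P1": (("A", "B", "C"), 10), "P2": (("A", "B", "D"), 20)}
matches = detect(patterns, [("A", 1), ("B", 2), ("C", 3), ("D", 15)])
```

Because a shared state keeps the largest window among its patterns, the sketch re-checks each pattern's own window before reporting a match at a final state.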

Because the number of instances in a system processing a large workload may be huge, traversing all of them on every event arrival is impractical. Instead, for each event type T, a list l_(T) is defined to contain all states with an outgoing transition accepting T. The size of l_(T) can never exceed the number of patterns in a workload containing T in their specification and will be substantially lower under an efficient sharing strategy that aims to merge states that process interleaving event types. At runtime, NFA instances are stored in a hash table according to their associated state, and the arrival of an event of type T only triggers the traversal of instances associated with states in l_(T). For example, the state lists of a multi-pattern NFA in FIG. 5B are l_(A)={q₂, q₃}, l_(B)={q₁}, l_(C)={q₂}, l_(D)={q₄}.
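As a small illustration of the state lists, the sketch below derives l_(T) from a set of evaluation orders. It assumes, for illustration only, that each state is identified by the prefix of event types leading to it rather than by a numeric label, so the root q₁ of FIG. 5B corresponds to the empty prefix; the function name `state_lists` is hypothetical.

```python
from collections import defaultdict

def state_lists(orders):
    """Map each event type T to the set of states (prefixes) that have
    an outgoing transition accepting T."""
    lists = defaultdict(set)
    for order in orders:
        for k, etype in enumerate(order):
            lists[etype].add(tuple(order[:k]))  # state reached before consuming etype
    return dict(lists)

# Evaluation orders of FIG. 5B: B,C,A and B,A,D.
lists = state_lists([("B", "C", "A"), ("B", "A", "D")])
```

With the prefixes `()`, `("B",)`, `("B", "C")` and `("B", "A")` standing in for q₁ through q₄, the result reproduces the lists given in the text: type A is accepted in two states, and every other type in one.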

Multi-Pattern Tree

Global evaluation plans utilized by multi-pattern NFAs are similarly structured in a tree-like manner. This plan type may be referred to as the multi-pattern tree (MPT). Given an MPT, a multi-pattern NFA is constructed by simply copying the structure of the former.

As described above, an MPT is created by the optimizer. In some embodiments, the present disclosure provides for an optimizer which proceeds by creating an initial MPT and repeatedly modifying it. Hence, efficient creation and modification operations are crucial for minimizing the optimization cost. In implementing these operations, the core principle of MPT behavior is to unconditionally share all shareable prefixes of the supplied local evaluation plans (orders). To add an evaluation order O to an existing MPT, the optimizer iterates over O and creates a new node only if no equivalent one exists. Two nodes are considered equivalent if and only if they correspond to identical sequences of event types, and if their edges specify identical conditions. Similarly, a plan is removed by iterating over the respective order and only deleting states that are not shared with other patterns.

FIGS. 6A-6C illustrate MPT modification examples, e.g., addition and removal of a plan from an MPT. The complexity of both operations is O(m), where m is the length of the evaluation order. FIG. 6A depicts an MPT from FIG. 5A and a local plan for a pattern SEQ(A,C,E); FIG. 6B depicts the MPT following the addition of the new evaluation plan (the path corresponding to the newly added plan is highlighted); and FIG. 6C depicts the MPT after the local evaluation plan for SEQ(A,B,C) is removed.
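The O(m) add and remove operations can be sketched with a prefix tree whose nodes carry a reference count. This is an illustrative data structure, not the disclosed implementation: the class names `MPTNode` and `MPT`, the `refs` counter and the `size` helper are assumptions, edge conditions are ignored (nodes are considered equivalent when their event-type sequences match), and final states are not tracked.

```python
class MPTNode:
    def __init__(self):
        self.children = {}   # event type -> MPTNode
        self.refs = 0        # number of evaluation orders passing through this node

class MPT:
    def __init__(self, orders=()):
        self.root = MPTNode()
        for order in orders:
            self.add(order)

    def add(self, order):
        """Walk the order from the root, creating a node only when none exists."""
        node = self.root
        for etype in order:
            child = node.children.get(etype)
            if child is None:
                child = node.children[etype] = MPTNode()
            child.refs += 1
            node = child

    def remove(self, order):
        """Walk a previously added order, deleting only nodes no other order uses."""
        node = self.root
        for etype in order:
            child = node.children[etype]
            child.refs -= 1
            if child.refs == 0:
                del node.children[etype]   # the unshared suffix is dropped with it
                return
            node = child

    def size(self):
        """Number of nodes, including the root."""
        count, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            count += 1
            stack.extend(node.children.values())
        return count

# The sequence of FIGS. 6A-6C: start from the MPT of FIG. 5A,
# add the plan for SEQ(A,C,E), then remove the plan for SEQ(A,B,C).
mpt = MPT([("A", "B", "C"), ("A", "B", "D")])
size_before = mpt.size()
mpt.add(("A", "C", "E"))
size_after_add = mpt.size()
mpt.remove(("A", "B", "C"))
size_after_remove = mpt.size()
```

The reference count is one way to decide in constant time per node whether a state is still shared; once a node's count drops to zero, every node below it belongs to the removed order as well, so the whole suffix can be detached at once.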

Creating an MPT from a set of orders {O₁, . . . , O_(n)} is implemented by iteratively adding the orders to an initially empty tree. This operation requires O(n·max(m_(i))) time and space, where m_(i) is the length of O_(i).

Since MPTs merge all common prefixes, an MPT can be uniquely defined by the tuple (O₁, . . . , O_(n)). Forcing some nodes not to be shared is only possible by modifying the individual evaluation orders. This way, careful selection of local evaluation plans by the optimizer can achieve the perfect balance between sharing degree and local evaluation plan quality.

Runtime Complexity and Multi-Pattern Cost Model

This section analyzes the runtime complexity of the MCEP evaluation process described above, and derives the cost function definition for multi-pattern trees.

The total cost associated with processing a single event e of type T is the sum of two components: 1) the cost of combining e with the existing partial matches and creating new instances as a result of successful matching; 2) the cost of purging the instances created as a result of e's arrival upon their expiration. The former will be denoted as CP(T) and the latter as CR(T).

Both functions depend on the expected number of instances active at the time of an event arrival. Reducing the number of instances (or, more generally, the size of intermediate results) is a common optimization goal in multiple fields, including database query optimization and complex event processing. For an order-based plan O=(E_(q₁), . . . , E_(q_(m))) detecting a pattern P=(ℰ, S, C, W), this cost function is defined as:

$Cost_{ord}(O, P, Stat) = \sum_{k=1}^{|\mathcal{E}|} Cost_{ord}^{k}(O, P, Stat),$

where Cost_(ord)^(k) is the cost of the k^(th) state in the chain-based NFA following O, calculated as follows:

$Cost_{ord}^{k}(O, P, Stat) = W^{k} \cdot \prod_{i=1}^{k} r_{q_i} \cdot \prod_{i,j \leq k;\ i \leq j} sel_{q_i,q_j}^{C},$

where r_(i); i∈[1, m] and sel_(i,j)^(C); i,j∈[1, m] are as defined above.
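The order-based cost can be evaluated directly from the two formulas. The sketch below is an illustration with assumed inputs: arrival rates are held in a dictionary keyed by event type, selectivities in a dictionary keyed by a `frozenset` of the two event types (a single-element set for a condition on one type), and missing entries default to 1 as stated for Stat. The function names and the numbers are hypothetical.

```python
def cost_ord_k(order, k, rates, sel, window):
    """Cost of the k-th state: W^k * prod(r_{q_i}) * prod(sel_{q_i,q_j}), i <= j <= k."""
    prefix = order[:k]
    cost = window ** k
    for etype in prefix:
        cost *= rates[etype]
    for i in range(k):
        for j in range(i, k):
            cost *= sel.get(frozenset((prefix[i], prefix[j])), 1.0)
    return cost

def cost_ord(order, rates, sel, window):
    """Total cost of an order-based plan: sum of the costs of its states."""
    return sum(cost_ord_k(order, k, rates, sel, window)
               for k in range(1, len(order) + 1))

# Hypothetical statistics: sensor C is rare, and half of the (A, B) pairs
# satisfy the condition defined between them.
rates = {"A": 10.0, "B": 10.0, "C": 1.0}
sel = {frozenset(("A", "B")): 0.5}
cost_abc = cost_ord(("A", "B", "C"), rates, sel, window=2.0)  # 20 + 200 + 400
cost_cba = cost_ord(("C", "B", "A"), rates, sel, window=2.0)  # 2 + 40 + 400
```

With these numbers the reordered plan C, B, A has the lower cost, since starting from the rarest event type keeps the early states small.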

In some embodiments, the above definition may be used to calculate the expected number of instances existing simultaneously at any given moment during MPT-based multi-pattern evaluation. Given a node N, let 𝒫_(N) denote the path from the root of the MPT to N (by definition of a tree, there is always exactly one such path). For the root R, 𝒫_(R)=Ø is set. The total number of instances is the sum of the numbers of instances associated with each NFA state (and hence with the corresponding MPT node), calculated as follows:

$\#inst(MPT, WL, Stat) = \sum_{N \in MPT} Cost_{ord}^{|\mathcal{P}_N|}(\mathcal{P}_N, WL, Stat).$

Thus, to calculate the number of instances to be traversed upon arrival of an event of type T, the instances associated with the states in l_(T) are summed:

$\#inst_T(MPT, WL, Stat) = \sum_{S \in l_T} Cost_{ord}^{|\mathcal{P}_{N(S)}|}(\mathcal{P}_{N(S)}, WL, Stat),$

where N(S) denotes a node corresponding to S in MPT.

The processing cost per event is now derived as follows. Let C_(a) be the cost of accessing an instance, C_(n) the cost of creating a new instance and inserting it into the data structure, and C_(r) the cost of removing an instance from the system. In addition, let C_(ν)(T, 𝒫_(N)) denote the cost of verifying the conditions between a new event of type T and the events preceding T in 𝒫_(N), and let Sel_(ν)(T, 𝒫_(N)) denote the total selectivity of the above conditions. To make C_(ν) and Sel_(ν) well-defined, C_(ν)=Sel_(ν)=0 is set if T∉𝒫_(N). Then, the expected cost of processing a single event of type T is:

$CP(T) = \sum_{S \in l_{T}} \left( \mathrm{Cost}_{ord}^{|\mathcal{P}_{N(S)}|}(\mathcal{P}_{N(S)}, WL, Stat) \cdot \left( C_{a} + C_{\nu}(T, \mathcal{P}_{N(S)}) + Sel_{\nu}(T, \mathcal{P}_{N(S)}) \cdot C_{n} \right) \right).$

To calculate the cost of removing the expired instances, it is observed that the expected number of instances created in state S after processing a new event of type T is equal to Sel_(ν)(T, 𝒫_(N(S))). Thus, the cost of eventually removing these instances upon their expiration is:

$CR(T) = \sum_{S \in l_{T}} \mathrm{Cost}_{ord}^{|\mathcal{P}_{N(S)}|}(\mathcal{P}_{N(S)}, WL, Stat) \cdot Sel_{\nu}(T, \mathcal{P}_{N(S)}) \cdot C_{r}.$

The above analysis emphasizes two main performance objectives of an MCEP system attempting to minimize the processing cost per event. First, the sharing degree needs to be maximized to reduce the sizes of the state lists l_(T). Second, the cost of the local evaluation plans in terms of the expected number of simultaneously existing instances has to be as low as possible. As illustrated in FIG. 3, there might be a conflict between these two objectives, which will be solved by defining an optimization problem later on.

The extended formula for the expected number of instances represents the same parameter dependencies as does the expression CP(T)+CR(T). Hence, it will be used as a cost function for measuring the quality of MPTs:

$\mathrm{Cost}_{ord}^{multi}(MPT, WL, Stat) = \#inst(MPT, WL, Stat).$
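As a hedged illustration of this cost function, the following Python sketch computes the number of instances for an MPT obtained by merging the common prefixes of a set of evaluation orders. It assumes, for simplicity, a single time window shared by all patterns, a selectivity of 1.0 for unspecified conditions, and that the root (whose path is empty) contributes nothing; the names `path_cost` and `mpt_instances` are hypothetical.

```python
def path_cost(path, window, rates, sel):
    """Expected number of instances at the MPT node reached by `path`."""
    cost = float(window) ** len(path)
    for event_type in path:
        cost *= rates[event_type]
    for i in range(len(path)):
        for j in range(i, len(path)):
            a, b = path[i], path[j]
            cost *= sel.get((a, b), sel.get((b, a), 1.0))
    return cost


def mpt_instances(orders, window, rates, sel):
    """Sum of the per-node costs over all non-root nodes of the merged tree."""
    # Each distinct non-empty prefix corresponds to exactly one tree node,
    # so a prefix shared by several orders is counted only once.
    nodes = {tuple(order[:k])
             for order in orders
             for k in range(1, len(order) + 1)}
    return sum(path_cost(path, window, rates, sel) for path in nodes)
```

With arrival rates A=1, B=2, C=4, E=1 and a unit window, sharing the longer prefix (A, C) between two patterns yields 17 expected instances, whereas keeping the locally cheapest orders (A, B, C) and (A, E, C), which share only A, yields 16. This is a small instance of the conflict between sharing degree and local plan cost noted above.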

MCEP Optimization Problem

In some embodiments, the problem to be solved by an MCEP optimizer may be formally defined as follows: Given an order-based plan O for a pattern P and a multi-pattern tree MPT, O∈MPT if and only if MPT contains a path 𝒫 of length |O|, starting at the root and ending at some final state, such that the event types and the conditions specified on the transitions in 𝒫 are identical to those of an NFA detecting P according to O. For example, the MPT in FIG. 6B satisfies O₃=(A, C, E)∈MPT. ORD_(P) denotes the set of all valid order-based evaluation plans for P. For a pattern of size m, |ORD_(P)|=m!.

Accordingly, a tree-based MCEP optimization problem (T-MCEP) may be defined as follows: Given a workload WL of n patterns and a statistics collection Stat, find a multi-pattern tree MPT minimizing the value of the cost function Cost_(ord)^(multi)(MPT, WL, Stat) subject to

$\forall P_{i},\ 1 \leq i \leq n:\ \exists O \in ORD_{P_{i}}\ \text{s.t.}\ O \in MPT.$

The path in the MPT corresponding to the evaluation order of a pattern P_(i) is denoted as 𝒫_(i).

The complexity of T-MCEP may be described as follows: It can be noted that for n=1, the present problem is equivalent to the single-pattern CEP optimization problem (SCEP), thoroughly discussed in previous work. In particular, it was shown in, e.g., I. Kolchinsky and A. Schuster, “Join query optimization techniques for complex event processing applications.” PVLDB, 11(11):1332-1345, 2018, that SCEP is NP-complete by reducing it to the problem of join evaluation order generation. The NP-completeness of this latter problem was in turn proven through a reduction to the maximum clique problem. The maximum clique problem is not only known to be NP-complete, but is also hard to approximate. It was demonstrated that, unless NP=ZPP, no polynomial-time algorithm exists that approximates the problem within the factor of n^(1−ε), where n is the size of the graph. By correctness of the reductions, this result applies also to the SCEP problem, and, by generalization, to T-MCEP.

Optimization Framework for T-MCEP

T-MCEP is a computationally hard optimization problem, characterized by an enormously large solution space and multiple local minima. Therefore, advanced techniques are needed in order to produce a high-quality solution under tight restrictions common for real-time MCEP systems.

The algorithms employed by the present optimizer to achieve this goal implement the local search paradigm. Local search is a well-known approach for finding approximate solutions for hard optimization problems, based on executing heuristically guided random walks in the solution space and searching for the cheapest solution subject to a predefined cost function. Local search methods are successfully applied for solving a wide range of problems, from the classic traveling salesman problem to code design and VLSI layout synthesis.

Local search methods present several important benefits for real-time streaming applications, and in particular for MCEP. Most importantly, they offer a tradeoff between the quality of the returned solution and the running time of the search. Since the local search procedure keeps a “current best” solution at any point of its execution, it can always be interrupted due to an expired time limit and will return a valid solution, albeit not necessarily the cheapest. This property makes local search methods an attractive choice for targeting the MCEP optimization problem under tight real-time constraints.

Multi-Pattern Graph

Let π_(χ)(Y) denote a projection of an expression Y on a set of variables χ. Y can be either a pattern structure or a condition set as defined above, for example, π_((B, D))(SEQ(A, B, C, D))=SEQ(B, D). Given a pattern P=(ℰ, S, C, W), another pattern P′=(ℰ′, S′, C′, W′) is a subpattern of P (marked as P′⊆P) if ℰ′⊆ℰ, S′=π_(ℰ′)(S), C′=π_(ℰ′)(C), and W′≤W.

A common subpattern P_(ij)=(ℰ_(ij), S_(ij), C_(ij), W_(ij)) of two patterns P_(i), P_(j) is a pattern satisfying (P_(ij)⊆P_(i))∧(P_(ij)⊆P_(j)), such that W_(ij)=min(W_(i), W_(j)). A maximal common subpattern of P_(i), P_(j) is a common subpattern P_(ij), such that no other common subpattern P′_(ij) satisfies P_(ij)⊆P′_(ij). It may be denoted by MP_(ij) herein. In addition, the set of all subsets of MP_(ij), that is, all common subpatterns of P_(i) and P_(j), is denoted by Γ_(ij). Obviously, Γ_(ij)=Γ_(ji) for each i, j. The above definitions are trivially extended to an arbitrary number of intersecting patterns.

To illustrate the above notations, let P₁:SEQ(A,B,C,D) and P₂:SEQ(A,E,C,D). Assume that both patterns have no conditions and W₁=10, W₂=20. Then, SEQ(A, D), SEQ(C, D), and SEQ(A, C) with W=10 are common subpatterns of P₁ and P₂, while SEQ(C, A) is a subpattern of neither, since it has a conflicting structure. The maximal common subpattern is SEQ(A, C, D).
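For condition-free sequence patterns in which each event type appears at most once, a maximal common subpattern corresponds to a longest common subsequence of the two type sequences. The following Python sketch reproduces the example under exactly these simplifying assumptions; the helper names are hypothetical and the handling of conditions and of non-sequence operators is omitted.

```python
def is_subsequence(sub, seq):
    """True if `sub` can be obtained from `seq` by deleting event types."""
    remaining = iter(seq)
    return all(event_type in remaining for event_type in sub)


def maximal_common_seq(p1, w1, p2, w2):
    """Longest common subsequence of two SEQ patterns and its time window."""
    n, m = len(p1), len(p2)
    # best[i][j] holds a longest common subsequence of p1[:i] and p2[:j].
    best = [[()] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if p1[i] == p2[j]:
                best[i + 1][j + 1] = best[i][j] + (p1[i],)
            else:
                best[i + 1][j + 1] = max(best[i][j + 1], best[i + 1][j], key=len)
    # The window of a common subpattern is the smaller of the two windows.
    return best[n][m], min(w1, w2)
```

Calling `maximal_common_seq(("A", "B", "C", "D"), 10, ("A", "E", "C", "D"), 20)` returns the sequence (A, C, D) with window 10, while `is_subsequence(("C", "A"), ("A", "B", "C", "D"))` is false, matching the conflicting-structure case.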

The multi pattern graph MPG=(V, E) is a data structure capable of efficiently collecting, maintaining, and retrieving the information regarding the mutual subpatterns of P₁, . . . , P_(n). For each pattern P_(i), MPG contains a vertex ν_(i)∈V. For each pair of distinct patterns P_(i), P_(j) with non-empty intersection (i.e., satisfying Γ_(ij)≠Ø), an undirected edge e_(ij)=(ν_(i), ν_(j), Γ_(ij))∈E is defined.

FIG. 7 depicts an MPG for a workload of 6 patterns. For presentation clarity, edges with maximal common subpattern of size 1 are not shown. The triplet P₁, P₂ and P₃ share a maximal common pattern SEQ(A, C). P₃ and P₄ have two distinct maximal common sub-patterns. P₆ is fully contained in P₅.

In the general case, an MPG is an arbitrary, not necessarily connected graph. However, it can be noted that any algorithm solving T-MCEP can be activated separately on each connected component, and the results can then be combined to produce the final plan. Not only does this observation allow the problem to be solved much more efficiently in the presence of multiple components, but it also makes it possible to limit the discussion below to connected graphs.
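The decomposition into connected components may be sketched as follows. This Python example is illustrative only: it assumes condition-free patterns, so that two patterns intersect whenever they share at least one event type, and it labels each edge with the shared event types rather than with the full set Γ_(ij). The function names are hypothetical.

```python
from itertools import combinations


def build_mpg(patterns):
    """Return edges {(i, j): shared event types} for intersecting patterns."""
    edges = {}
    for i, j in combinations(range(len(patterns)), 2):
        shared = set(patterns[i]) & set(patterns[j])
        if shared:
            edges[(i, j)] = shared
    return edges


def connected_components(num_vertices, edges):
    """Group vertex indices into connected components by depth-first search."""
    adjacency = {v: set() for v in range(num_vertices)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, components = set(), []
    for start in range(num_vertices):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            vertex = stack.pop()
            component.append(vertex)
            for neighbor in adjacency[vertex] - seen:
                seen.add(neighbor)
                stack.append(neighbor)
        components.append(sorted(component))
    return components
```

Each returned component can then be handed to the optimizer independently, and the resulting partial plans combined.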

To guarantee an efficient local search procedure, the MPG has to occupy small space. Moreover, addition and removal operations must be fast and low-cost, and likewise for the retrieval of pattern intersection information. By utilizing compact graph representation and advanced optimizations, it is possible to guarantee near constant cost of retrieval and worst-case linear cost of addition and deletion with near linear space complexity.

Local Search Algorithms for T-MCEP

A local search problem is specified by a pair (φ, f), where φ is a set of feasible problem solutions and f: φ→ℝ is a cost function. The goal is then to find an optimal solution s* such that f(s*)≤f(s) for all s∈φ. In the case of T-MCEP, φ consists of all possible MPTs and f=Cost_(ord)^(multi).

The search starts from some initial solution s_(init). Local search algorithms traverse the search space by exploring the neighborhood of the current solution. A domain-specific neighborhood function 𝒩: φ→2^(φ) maps a solution to a set of its neighbors, i.e., solutions that can be obtained by performing a slight modification. The strategy for performing the search is determined by the meta-heuristic in use. A local search algorithm for a given problem can be uniquely defined by a combination of a meta-heuristic and a neighborhood function. When a predefined stopping criterion is satisfied, the search terminates and the cheapest observed solution is returned.
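A generic local search loop of this kind may be sketched as below, using simulated annealing as the meta-heuristic. This is a minimal sketch rather than the optimizer's implementation: the geometric cooling schedule, the step-count stopping criterion, the parameter defaults, and the function signature are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(initial, cost, random_neighbor,
                        steps=10_000, t0=1.0, cooling=0.999, seed=0):
    """Return the cheapest solution observed during the random walk."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = random_neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse neighbor with a
        # probability that shrinks as the temperature decreases.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature = max(temperature * cooling, 1e-9)
    return best, best_cost
```

Because `best` is updated on every improvement, the loop can be cut short at any iteration and still return a valid solution that is no worse than the initial one.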

The local search algorithms employed by the present optimizer for solving T-MCEP utilize two well-known meta-heuristics, simulated annealing and Tabu search. It can be noted that the solution space of the present problem is enormously large. For a workload of size n, there are Π_(i=1)^(n)|P_(i)|! possible MPTs, where |P_(i)| denotes the number of event types in the i^(th) pattern. Fortunately, closer analysis of the solution space makes it possible to immediately discard the overwhelming majority of the subplan combinations.

The following can be observed regarding the possible local evaluation orders for a pattern P_(i) in the shared workload. If no subset of P_(i) can be shared with other patterns, it only makes sense to select the most efficient evaluation order. Otherwise, for every shareable sub-pattern P′⊆P_(i), it is required to consider an order that starts with the best order O′ for P′, then continues with the best order for the remainder of the pattern given O′ as the prefix. Note that not only the maximal common subpatterns but also their subsets must be considered, including the empty subset (which is equivalent to the case when no such P′ exists).

The following theorem formally states the above:

-   -   Theorem 1: Let MPT_(opt) be the optimal multi pattern tree for some workload WL. Then, for each path 𝒫_(i) in MPT_(opt) corresponding to the pattern P_(i), at least one of the following holds: (1) 𝒫_(i) is the optimal evaluation order for P_(i); (2) 𝒫_(i) can be divided into a non-empty prefix Pref_(i) that is shared with at least one additional pattern and a non-shared suffix Suff_(i), and it is the most efficient local evaluation order for P_(i) out of those starting with Pref_(i).

The proof is straightforward: assume that neither (1) nor (2) holds and show that MPT_(opt) can be improved by modifying Suff_(i) to make 𝒫_(i) the most efficient order starting with Pref_(i), which contradicts the optimality of MPT_(opt). Since Suff_(i) is not shared by definition, improving it necessarily leads to an improvement of MPT_(opt).

Theorem 1 reduces the maximal number of potential orders for a single pattern from |P_(i)|! to Σ_(j=1)^(n)|Γ_(ij)|. However, to apply the above strategy, an algorithm is required to calculate local evaluation plans as described above. The existence of a deterministic local plan generation algorithm is assumed, capable of the following functionality:

-   -   Given a pattern P and the statistical event characteristics Stat, return the cheapest local order-based evaluation plan O subject to Cost_(ord).
    -   Given a pattern P, its subpattern P′, an evaluation plan O′ for P′, and the statistics collection Stat, return the cheapest (subject to Cost_(ord)) local order-based evaluation plan O starting with prefix O′.

Many algorithms answering the above requirements have been proposed. In particular, any greedy algorithm or an algorithm based on dynamic programming satisfies both conditions. While most algorithms are not guaranteed to produce an optimal result due to the NP-hardness of local evaluation plan generation, they provide empirically accurate approximations.
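One possible greedy plan generator supporting both requirements is sketched below in Python. It is a heuristic illustration, not the dynamic programming algorithm used in the experiments: at each step it appends the event type that contributes the smallest factor (arrival rate times the selectivities against the types already placed) to the next state's cost. The function name, the data layout, and the default selectivity of 1.0 are assumptions.

```python
def greedy_order(event_types, rates, sel, prefix=()):
    """Greedy order-based plan; `prefix` fixes the first evaluated types."""
    order = list(prefix)
    remaining = [t for t in event_types if t not in prefix]
    while remaining:
        def step_factor(candidate):
            # Factor by which appending `candidate` scales the state cost.
            factor = rates[candidate]
            for placed in order + [candidate]:
                factor *= sel.get((placed, candidate),
                                  sel.get((candidate, placed), 1.0))
            return factor

        chosen = min(remaining, key=step_factor)
        order.append(chosen)
        remaining.remove(chosen)
    return tuple(order)
```

Calling the function with an empty prefix covers the first requirement, and passing a shared order O′ as `prefix` covers the second. Like other greedy heuristics, it is not guaranteed to return the cheapest order.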

With the above observation in mind, neighborhood functions for T-MCEP can be defined. The first function produces a neighboring solution by selecting a random edge (ν_(i), ν_(j)) in the MPG and a common subpattern P∈Γ_(ij). P is restricted to be different from the subpattern that is shared between P_(i) and P_(j) in the current MPT (however, its subpatterns are allowed). A neighbor is generated by invoking the local plan generation algorithm to create new evaluation orders O_(i), O_(j) sharing a common prefix O_(p), and replacing 𝒫_(i), 𝒫_(j) with the resulting orders. This neighborhood will be denoted as an edge-based neighborhood and the notation 𝒩_(edge) will refer to it. 𝒩_(edge)(MPT) will denote the set of all solutions that can be obtained by the above procedure. The size of the neighborhood produced by 𝒩_(edge) is

$\frac{1}{2} \cdot \sum_{i=1}^{n} \sum_{j=1;\ j \neq i}^{n} |\Gamma_{ij}|.$

The main drawback of 𝒩_(edge) is that it can only attempt pairwise sharing. In many real-life scenarios, a single subexpression might be shared between patterns comprising a large fraction of the workload. While sharing such subexpression between all involved patterns may dramatically increase the performance, only considering two of them may fail to produce an improvement over the plan not sharing the expression at all. As a result, the sharing opportunity may be missed.

To overcome this limitation, a vertex-based neighborhood 𝒩_(vertex) may be defined. Let V_(i)=∪_((ν_(i), ν_(j))∈E) P_(ij) be called the vicinity of ν_(i). Instead of an edge, the neighborhood function will select a vertex ν_(i) and a subpattern P in the vicinity of ν_(i). Then, let Γ_(P) denote a set of all patterns containing P. This set can be efficiently retrieved from the MPG as further described below. min(k, |Γ_(P)|) patterns are selected, where k≥2 is a predefined parameter. Then, the local plan generation algorithm will be invoked to generate new evaluation orders sharing a common prefix O_(p). The variation of 𝒩_(vertex) using a particular value for k will be denoted as 𝒩_(vertex)^(k). Note that 𝒩_(vertex)^(2) is equivalent to 𝒩_(edge). The size of the neighborhood of 𝒩_(vertex)^(k) is bounded by

$\sum_{i=1}^{n} \sum_{P \in V_{i}} \binom{|\Gamma_{P}|}{k}.$
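The two neighborhood size expressions can be evaluated directly, as in the following illustrative Python sketch. The input layout is an assumption: `gamma` maps each unordered pattern pair to its set of common subpatterns, `vicinity` maps each vertex to the subpatterns in its vicinity, and `containing` gives the number of patterns containing each subpattern.

```python
from math import comb


def edge_neighborhood_size(gamma):
    """Size of the edge-based neighborhood."""
    # Counting each unordered pair once equals half of the double sum
    # over ordered pairs (i, j) with i != j.
    return sum(len(subpatterns) for subpatterns in gamma.values())


def vertex_neighborhood_bound(vicinity, containing, k):
    """Upper bound on the size of the vertex-based neighborhood for a given k."""
    return sum(comb(containing[subpattern], k)
               for subpatterns in vicinity.values()
               for subpattern in subpatterns)
```

The binomial term grows quickly with k when a subpattern is contained in many patterns, which is consistent with the much larger neighbor spaces reported for the vertex-based neighborhoods.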

The per-step complexity of the neighborhood functions 𝒩_(edge) and 𝒩_(vertex)^(k) is O(Σ_(i=1)^(n) m_(i)) multiplied by the complexity of the local plan generation algorithm. A step is defined as a single selection of a neighbor and the evaluation of its cost.

In all algorithms, the initial state is set to the MPT in which all patterns are evaluated according to the best possible local evaluation orders, that is, 𝒫_(i) is the order returned by the local plan generation algorithm for (P_(i), Stat), for all i.

MCEP with Arbitrary Subexpression Sharing

The multi-pattern plan generation method above only considers prefix sharing. This introduces a significant limitation, since the optimizer is required to move common subpatterns to the MPT root in order to share their computation. This mechanism also prevents a pattern from sharing multiple distinct subexpressions with other patterns. As an example, consider a workload consisting of patterns P₁: SEQ(A, B, C, D), P₂: SEQ(A, E, C, F), P₃: SEQ(G, B, H, D). In order to share the subpattern SEQ(A, C) with P₂, the evaluation order of P₁ must start with (A, C) or (C, A). On the other hand, it has to start with (B, D) or with (D, B) to share the subpattern SEQ(B, D) with P₃. The optimizer will have to refrain from sharing one of the subpatterns in this case.

In some embodiments, the present optimization framework is extended to arbitrary subexpression sharing. To that end, the local order-based plans are replaced with tree-based plans, shaped as binary trees. Tree-based plans specify the structure for tree-based single-pattern evaluation mechanisms. A leaf is defined for each event type, and the root of the tree serves as a final state. The evaluation proceeds from the leaves towards the root, with each internal node responsible for a subpattern consisting of the event types in its subtree. FIG. 8 presents three possible tree-based plans for a pattern SEQ(A, B, C). Tree-based evaluation mechanisms were shown by multiple studies to be more expressive and perform better than NFAs.

The tree-based evaluation process is similar to the one described for NFAs. As a new event arrives, an instance is created containing this event. Every instance corresponds to some subtree s of the tree-based plan. A new instance I is combined with previously created “siblings”, that is, instances associated with a node sharing the parent with the node of I. As a result, another instance containing the unified subtree is generated. This process continues iteratively until the root of the tree is reached or no siblings are found.

Similarly to MPT, a multi pattern multitree (MPM) is defined as the global plan consisting of multiple shared tree-based plans. Each pattern in an MPM has a dedicated root, and all leaves corresponding to the same event type are shared regardless of the plan in use. FIG. 9 depicts a possible MPM for a shared workload of patterns P₁: SEQ(A, B, C, D), P₂: SEQ(A, E, C, F), and P₃: SEQ(G, B, H, D). Note that the displayed plan successfully shares both subpatterns of P₁ with P₂ and P₃, a result that could not be achieved using an order-based approach.

The multitree-based MCEP optimization problem (M-MCEP) will be defined similarly to T-MCEP. The formal definitions of M-MCEP, the new cost functions Cost_(tree) and Cost_(tree)^(multi), and the corresponding extension of Theorem 1 can be found further below.

The MPM is created and modified similarly to the MPT. The complexity of the operations is not altered by switching to tree-based plans, as the number of nodes in a local tree-based plan is still linear in the number of the participating event types. In addition, the existence of a subtree T in an MPM can be tested in constant time (and an additional O(Σ_(i=1)^(n) m_(i)) space) by hashing the subtrees upon creation. The complexity analysis of runtime evaluation detailed above also remains unchanged for the multitree model, with the exception of the cost function Cost_(ord)^(multi) being replaced with Cost_(tree)^(multi).
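The subtree existence test may be illustrated with the following Python sketch, in which a tree-based plan is represented as nested pairs with event type names as leaves; this representation and the function name are assumptions. Note that Python recomputes the hash of a tuple on every lookup, so the lookup below is linear in the subtree size; an implementation aiming for constant-time tests would cache a hash value in each node upon creation.

```python
def register_subtrees(tree, index):
    """Add `tree` and all of its subtrees to the hash-based `index`."""
    index.add(tree)
    if isinstance(tree, tuple):
        for child in tree:
            register_subtrees(child, index)


# Hypothetical tree-based plans for P1: SEQ(A, B, C, D) and P2: SEQ(A, E, C, F).
plan_p1 = (("A", "C"), ("B", "D"))
plan_p2 = (("A", "C"), ("E", "F"))

subtree_index = set()
register_subtrees(plan_p1, subtree_index)
register_subtrees(plan_p2, subtree_index)

# The subtree joining A and C is registered once and can be shared.
shared_exists = ("A", "C") in subtree_index
```

Since equal subtrees hash to the same entry, the index holds one element per distinct subtree, regardless of how many plans contain it.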

The local search process for MPMs functions as described for MPTs above. However, now it is possible for a pattern to share multiple disjoint subtrees. Consider a situation where one such subpattern P̂₁ is already shared, and the optimizer attempts to share the second subpattern P̂₂ during the local search step. In this case, consider two separate options: (1) the most efficient tree containing P̂₂ regardless of the existing sharing of P̂₁; and (2) the most efficient tree containing both P̂₁ and P̂₂. This case can be generalized to sharing q subtrees and considering the (q+1)^(th) one. Due to this extension, the local plan generation algorithm is required to support multiple subtrees. More formally, it is required to be capable of the following:

-   -   Given a pattern P and the statistical event characteristics Stat, return the cheapest local tree-based evaluation plan T subject to Cost_(tree).
    -   Given a pattern P, a set of tree-based plans γ for some subpatterns of P, and the statistics collection Stat, return the cheapest (subject to Cost_(tree)) local tree-based evaluation plan T containing all trees in γ as subtrees.

Algorithms for tree-based plan generation satisfying the above requirements are discussed in, e.g., I. Kolchinsky and A. Schuster. Join query optimization techniques for complex event processing applications. PVLDB, 11(11):1332-1345, 2018; Y. Mei and S. Madden. ZStream: a cost-based query processor for adaptively detecting composite events. In SIGMOD Conference, pages 193-206. ACM, 2009.

When selecting the next state to be returned, the neighborhood functions will randomly choose whether existing shared subtrees should be preserved for the patterns involved. For 𝒩_(vertex)^(k), this decision will be performed independently for each of the k patterns sharing a common subpattern. To apply this modification, sharing information must be stored for each pattern in the MPG, which adds a memory requirement of O(n·max_(i)(|

|)). No further changes to the structure and the operations of the MPG are necessary for the tree-based evaluation model.

Experimental Evaluation

The present inventors have conducted an experimental evaluation to assess the overall system performance achieved by the present approach, as compared to the state-of-the-art methods for MCEP, and analyze the impact of the various parameters on the quality of the generated global plans.

Experimental Setup

Two independent datasets were used in the experiments. The first was taken from the NASDAQ stock market historical records [65]. Each data record represents a single update to the price of a stock, spanning a 1-year period and covering over 2100 stock identifiers with prices periodically updated. The input stream contained 80,509,033 primitive events, each consisting of a stock identifier, a timestamp, and a current price. The event format was also augmented with the precalculated difference between the current and the previous price of each stock. Updates of each stock identifier are considered as events belonging to a separate type.

The structure of the patterns in the workloads generated for this dataset was motivated by the problem of monitoring the relative changes in stock prices. Each pattern represented either a sequence or a conjunction of a number of event types and included a number of predicates, roughly equal to half the pattern size, comparing the difference attributes of two of the involved event types. In addition, about 20% of the patterns contained either a negation or a Kleene closure operator on some event type. As discussed above, the aforementioned combinations of pattern operators are sufficient to cover the whole spectrum of pattern structures. For example, a typical sequence pattern of size 3 is as follows:

P₁ : SEQ(MSFT, Kleene(GOOG), APPL); C₁ = {MSFT.diff < APPL.diff}.

The second dataset contains the vehicle traffic sensor data, provided by the city of Aarhus, Denmark [6] and collected over a period of 4 months from 449 observation points, with 13,577,132 primitive events overall. Each event represents an observation of traffic at the given point. The attributes of an event include, among others, the point ID, the average observed speed, and the total number of observed vehicles during the last 5 minutes. The patterns created for this dataset followed the rules specified above and were motivated by normal driving behavior, where the average speed tends to decrease with the increase in the number of vehicles on the road. The user-defined task is detecting the violations of this model, that is, combinations of three or more observations with either an increase or a decline in both the number of vehicles and the average speed.

Unless stated otherwise, all arrival rates and predicate selectivities were calculated in advance during the preprocessing stage. The measured arrival rates varied between 2 and 47 events per second, and the selectivities ranged from 0.003 to 0.92.

The workloads were created by grouping the patterns generated as described above based on a set of parameters, including the number of patterns in a workload, average pattern size (number of event types in a pattern), and pattern time window. Unless stated otherwise, the default values were set to 100 patterns per workload, an average pattern size of 5 event types, and the time window of 15 minutes.

Unless stated otherwise, all experiments were conducted on the full version of the present MCEP optimizer presented above. The default local search time limit for all algorithms was set to 180 seconds. The local plan generation algorithm used is the dynamic programming algorithm described in I. Kolchinsky et al., “Join query optimization techniques for complex event processing applications.” PVLDB, 11(11):1332-1345, 2018.

Throughput, defined as the number of events processed per second during pattern detection, was selected as the main performance metric. However, similar results could be obtained for algorithms targeting any other optimization goal, such as minimizing latency, power consumption, or communication cost.

All experiments were repeated on 10 independently generated workloads, and the displayed results were averaged among all trials. All models and algorithms were implemented in Java. The experiments were run on a machine with 2.20 GHz CPU and 16.0 GB RAM.

Experimental Results—Impact of Input Parameters on System Performance

The first experiment evaluated the performance of the local search algorithms described above, as a function of the workload size. FIG. 10 shows throughput gain as a function of the workload size for different combinations of a meta-heuristic, a neighborhood function, a subexpression sharing strategy, and a dataset: FIG. 10A depicts stocks dataset, simulated annealing; FIG. 10B depicts stocks dataset, Tabu search; FIG. 10C depicts traffic dataset, simulated annealing; and FIG. 10D depicts traffic dataset, Tabu search.

Here and in all subsequent experiments, the graphs show the relative throughput gain over the trivial global evaluation plan, utilizing no sharing and no rewriting techniques. The neighborhoods 𝒩_(edge), 𝒩_(vertex)^(4), and 𝒩_(vertex)^(8) were tested in conjunction with simulated annealing and Tabu search meta-heuristics on stock (FIGS. 10A-10B) and traffic (FIGS. 10C-10D) datasets. For 𝒩_(edge) alone, the prefix-only version of the present framework was evaluated in addition to the default arbitrary-subset version.

Overall, all combinations demonstrated more significant throughput gains for larger workloads, ranging from a factor of 21 to over 72. Despite being the simplest, the 𝒩_(edge) neighborhood showed the best results, finding evaluation plans that outperformed the trivial plan by a factor of up to 72.7 for the stock dataset and up to 50.7 for the traffic dataset. This can be explained by the overwhelming size of the neighbor spaces explored by 𝒩_(vertex)^(4) and 𝒩_(vertex)^(8). Tight time constraints prevent the system from locating the best optimization opportunities in huge neighborhoods. Thus, although 𝒩_(vertex) neighborhoods contain all of the moves in 𝒩_(edge), the better moves are statistically harder to reach before the time expires. Comparable performance was observed for both meta-heuristics, with simulated annealing slightly outperforming Tabu search for the stock dataset and vice versa for the traffic dataset.

The choice of a subexpression sharing strategy was found to have a major impact on the system performance. When the optimizer was restricted to only consider sharing prefixes, applying the generated plans resulted in up to 5 times lower throughput (marked as ‘EDGE-PREFIX’ in all graphs) as compared to the plans produced using an identical setup without the above limitation (marked as ‘EDGE’). This observation fully matches prior analysis. As discussed above, a prefix-only approach ignores a significant fraction of the space of possible optimizations and limits a pattern to only sharing a single subexpression by utilizing order-based local plans as opposed to tree-based ones.

The scalability of the present optimizer was further assessed as subject to various parameters (FIGS. 11A-11D). Simulated annealing (marked as ‘SA’ in the graph) and Tabu search (marked as ‘TS’) were again evaluated on both datasets in conjunction with the best-performing neighborhood 𝒩_(edge). FIG. 11A depicts the throughput gain as a function of the average length of a pattern in a workload. The present approach seems to improve even more for longer patterns, speeding up the event processing by up to two orders of magnitude. This is not surprising, as longer patterns introduce more optimization opportunities. It was also observed that in most cases the simulated annealing meta-heuristic achieved better performance than Tabu search.

Unsurprisingly, the output plan quality also improves with increased time limit of the local search algorithm (FIG. 11B). Interestingly, the performance of simulated annealing seems to converge to a constant value, while Tabu search keeps improving for longer time limits. This can be explained by the distinctive behavior of the former after a large number of iterations, when the current threshold becomes small enough for the algorithm to converge to a local minimum.

The results obtained for different time window sizes (FIG. 11C) demonstrate similar trends. Since the cost function and the overall system throughput strictly depend on this parameter, increasing it leads to bigger differences in plan qualities, both calculated and empirically observed.

Finally, an experiment with patterns utilizing count-based windows was conducted. As opposed to specifications based on time-based windows defined above, count-based patterns require a match to appear within the last W arrived events rather than within W time units.
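The difference between the two window types may be illustrated by the following Python sketch. It assumes that each event participating in a match carries its arrival index in the input stream and its timestamp; this representation and the function names are hypothetical.

```python
def within_time_window(match, window):
    """match: list of (arrival_index, timestamp) pairs of the matched events."""
    timestamps = [timestamp for _, timestamp in match]
    return max(timestamps) - min(timestamps) <= window


def within_count_window(match, window):
    """True if all matched events fall within `window` consecutive arrivals."""
    indices = [index for index, _ in match]
    return max(indices) - min(indices) < window
```

A match whose events arrive at stream positions 0, 3, and 9 within two time units satisfies a time window of 5 units but not a count window of 5 events, since ten arrivals separate its first and last events.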

FIG. 11D presents the results. For the stock dataset, even bigger performance boost was observed for larger windows as compared to the time-based scenario. This can be explained by the highly fluctuating event arrival rates exhibited by this dataset. When time-based windows are used, the peak load is only experienced during brief ‘bursts’, whereas large count-based windows cause the system to be constantly overloaded. Since the performance gain achieved by an efficient evaluation plan is proportional to the average system load, the latter case demonstrates a more significant increase in total throughput. In contrast, the results for the traffic dataset were extremely similar to those obtained for time-based windows due to much less skew in event distribution over the input stream.

Experimental Results—Comparison with Known Methods

The experiments summarized in FIGS. 10 and 11 were repeated for the basic sharing and the basic reordering methods, as well as for other known MCEP methods.

The basic sharing method (SH) refers to the maximal subexpression sharing technique discussed above. The basic reordering method (RE) greedily rebuilds the event sequence by picking the event type maximizing the cost function at each step.

The SPASS method (see M. Ray et al., “Scalable pattern sharing on event streams”, In Proceedings of the 2016 International Conference on Management of Data, pages 495-510, New York, NY, USA, 2016. ACM) selects the sub-patterns to share according to a metric called ‘redundancy ratio.’ This metric represents the potential gain from sharing the computation of a subexpression. Each subexpression is assigned a score, and the winners are chosen by approximating the well-known minimal substring cover problem. The MOTTO method (see S. Zhang et al. “Multi-query optimization for complex event processing in SAP ESP.”, In 33rd IEEE International Conference on Data Engineering, ICDE 2017, San Diego, CA, USA, April 19-22, 2017, pages 1213-1224, 2017) utilizes a combination of techniques referred to as MST (merge sharing technique), DST (decomposition sharing technique), and OTT (operator transformation technique). The system solves the directed Steiner minimum tree problem to select the best global plan produced using the above techniques.

FIGS. 12A-12H present the results. The redundancy ratio method and the merge-decomposition technique are marked as SH-RR and SH-MDT respectively. While both SH-RR and SH-MDT scale well with growing workload size (FIGS. 12A, 12E) and average pattern length (FIGS. 12B, 12F), the present optimizer achieves the best overall speedup, in some cases up to three times better than that of the runner-up solution.

This result follows from utilizing the reordering opportunities, which were shown to drastically boost CEP evaluation. On the other hand, the present approach also attempts to exploit sharing opportunities when possible, which allows it to outperform the pure reordering algorithm (RE) for large pattern sizes. The gaps were closer for time window evaluation (FIGS. 12C and 12G), with SA-EDGE still achieving an advantage of at least 25% over the second-best method. The results for count-based windows (FIGS. 12D and 12H) strictly follow the trends described for FIG. 11.

Experimental Results—Adaptive System Behavior

Next, the performance of the present system was evaluated in the presence of a dynamically changing input stream. For this experiment alone, semi-synthetic input was used. A component was implemented that accepts a parameter x and randomly and independently transforms every x incoming events before they are received by the evaluation mechanism. A transformation is performed by randomly picking y event types, creating their random permutation P and then replacing the type attribute of every affected event with the one following its value in P. This modification makes it possible to simulate rapid and drastic changes in the arrival rates of all types of events.
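The transformation component described above can be sketched as follows. This is only an illustrative reading of the description, not the implementation used in the experiments. It assumes that a fresh permutation is drawn for each block of x events, that the last type in P wraps around to the first, and that an event is a (type, timestamp) tuple; the names `draw_type_mapping` and `transform_stream` are hypothetical.

```python
import random
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Event = Tuple[str, float]  # (type, timestamp); hypothetical event layout


def draw_type_mapping(event_types: List[str], y: int, rng: random.Random) -> Dict[str, str]:
    """Pick y event types, permute them, and map each type to its successor in P."""
    perm = rng.sample(event_types, y)  # random selection and random order in one call
    # Assumption: the last type in P wraps around to the first one.
    return {perm[i]: perm[(i + 1) % y] for i in range(y)}


def transform_stream(events: Iterable[Event], event_types: List[str],
                     x: int, y: int, seed: Optional[int] = None) -> Iterator[Event]:
    """Rewrite the type attribute of events, redrawing the mapping every x events."""
    rng = random.Random(seed)
    mapping: Dict[str, str] = {}
    for i, (etype, timestamp) in enumerate(events):
        if i % x == 0:
            # Assumption: each block of x events gets an independent permutation.
            mapping = draw_type_mapping(event_types, y, rng)
        # Types outside the chosen y types are left unchanged.
        yield (mapping.get(etype, etype), timestamp)
```

Smaller values of x redraw the mapping more often, which corresponds to more frequent changes in the observed arrival rates.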

The experiment was repeated for y=5 and x ranging between 10 and 1000 on the static and the dynamic version of the present framework. In the static case, an evaluation plan was created on startup and used exclusively regardless of input changes. The dynamic version utilized an adaptive approach, restarting the plan calculation process when a drastic change in the statistics is detected. The results are depicted in FIGS. 13A-13B. Unsurprisingly, the initially generated plan fails to perform adequately when the input characteristics undergo on-the-fly changes. While extremely frequent input changes clearly reduce system performance, the adaptive method still leads to at least 10 times higher throughput.

Additional Experiments

Further experiments were conducted to study the influence of the workload statistical characteristics on the performance of the present optimizer. Only the best performing (according to the results presented above) combinations SA-EDGE and TS-EDGE were evaluated.

The statistical characteristics of workload generation are controlled using a pair of configurable parameters, multi-pattern graph density and normalized arrival rate difference. The multi-pattern graph density is defined as an average relative number of neighbors of a given pattern in an MPG. For example, in a workload of 100 patterns with MPG density equal to 0.5, each pattern will have 50 neighbors on average. This parameter is used to control the sharing sensitivity of a workload.
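As a small illustration of the density definition, the sketch below computes the density of an MPG given as an adjacency mapping. It assumes, consistently with the 100-pattern example above, that the average number of neighbors is normalized by the total number of patterns n; the function name `mpg_density` is hypothetical.

```python
from typing import Dict, Set


def mpg_density(neighbors: Dict[str, Set[str]]) -> float:
    """Average number of neighbors per pattern, relative to the workload size n."""
    n = len(neighbors)
    if n == 0:
        return 0.0
    average_degree = sum(len(peers) for peers in neighbors.values()) / n
    # Assumption: normalization by n, so that 100 patterns at density 0.5
    # have 50 neighbors on average.
    return average_degree / n
```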

The arrival rate difference, defined as the maximal difference in rates of two event types within a single pattern, makes it possible to manipulate the reordering sensitivity of a workload. For example, for an unconditional conjunction of 5 event types arriving at an identical rate, each of the possible 5! evaluation orders will have the same cost. However, if one of the types appears 100 times more frequently than the rest, the gain obtained by postponing the costly event type to the last state is considerably high. Patterns with varying degrees of reordering sensitivity are produced by limiting the selection of the event types for a pattern accordingly. The values of this parameter were normalized with respect to the maximal observed difference of 45.
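The 5-type example can be checked numerically. The sketch below is an illustration under stated assumptions: it scores an evaluation order as a left-deep tree using the tree-based cost function given later in this document (leaf cost W·r, internal node cost equal to the product of its children's costs and the selectivity), with selectivity 1 for an unconditional conjunction. The function names and the concrete rates are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple


def left_deep_cost(order: Sequence[str], rates: Dict[str, float], window: float = 1.0) -> float:
    """Cost of evaluating the event types in the given order as a left-deep tree."""
    leaf_costs = [window * rates[t] for t in order]
    total = sum(leaf_costs)          # cost of all leaves
    partial = leaf_costs[0]
    for leaf in leaf_costs[1:]:
        partial *= leaf              # internal node: C(L) * C(R) * 1 (no conditions)
        total += partial
    return total


def cheapest_orders(rates: Dict[str, float], window: float = 1.0) -> List[Tuple[str, ...]]:
    """Enumerate all orders and return those with the minimal cost."""
    costs = {o: left_deep_cost(o, rates, window) for o in permutations(rates)}
    best = min(costs.values())
    return [o for o, c in costs.items() if c == best]
```

With five types at rate 2, all 120 orders tie. Raising the rate of one type to 200 leaves only the 24 orders that place it last as cheapest, at a cost of 3436 compared with 6208 when the frequent type is processed first.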

FIGS. 14A-14D depict the achieved throughput gain as a function of the sharing sensitivity (FIGS. 14A-14B) and the reordering sensitivity (FIGS. 14C-14D) of the workload. The plots also show the performance of the basic reordering (RE) and the basic sharing (SH) methods discussed above.

The high gains of the local search methods do not exhibit dominant dependencies on either of the two parameters. While larger graph densities and rate difference limits introduce more sharing and reordering opportunities, they also increase the search space size and the number of potential local minima. Nevertheless, the present approach consistently outperforms the better of SH and RE for every attempted experimental configuration. At the extremes, local search tends to resort to an almost pure sharing plan for low arrival rate differences (since virtually no improvement can be achieved by reordering), whereas for sparse multi-pattern graphs the solution assigning the best local plan to all patterns is often preferred.

The basic reordering method becomes more efficient with increasing differences in arrival rate and is almost unaffected by the changes in graph density. The performance of the basic sharing method increases monotonically with graph density. It also decreases with the rate difference due to the smaller number of participating event types in more restricted workloads. Given a pair of workloads of the same size containing patterns of the same length, the workload with fewer event types will have more events of the same type on average, and is expected to offer more sharing opportunities.

Efficient Implementation of the Multi-Pattern Graph

As presented above, the multi-pattern graph for the workload $WL=\{P_{1},\ldots,P_{n}\}$ is defined as $MPG=(V,E)$, where $V=\{\nu_{i} \mid P_{i}\in WL\}$ and $E=\{e_{ij}=(\nu_{i},\nu_{j}) \mid \Gamma_{ij}\neq\emptyset\}$.

This formulation introduces potential performance issues. First, explicitly storing the set of common subpatterns Γ_(ij) requires O(2^(s)) memory, where s is the size of the maximal common subpattern. This can be solved by storing only the maximal common subpattern MP_(ij) instead, as the rest of the common subpatterns can be inferred from it. Second, when m patterns share the same subpattern, the MPG will contain $\binom{m}{2}$ edges representing the same subpattern set. Consequently, directly instantiating the MPG in memory would be extremely inefficient.

The present disclosure addresses this shortcoming by a compact graph representation. Rather than explicitly storing the vertices and the edges, for every distinct maximal common sub-pattern MP of some set of patterns Γ, Γ is kept in a hash table with MP as a key. In addition, a second hash table maps a single pattern P to a list of maximal common subpatterns with its peers in the MPG. This data structure still contains all the necessary information, additionally providing near-constant cost of retrieval and worst-case linear cost of addition and deletion of patterns. The space occupied by both hash tables is O(n·γ), where γ is the total number of distinct maximal common subpatterns in the workload. While the value of γ can reach n² in the worst case (and even exceed it in some cases), the way in which the hash tables are constructed makes it extremely unlikely for the space complexity to surpass O(n²).
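A minimal sketch of the two-table representation is given below. It is an illustration only: the class name `CompactMPG`, its method names, and the use of hashable tuples as subpattern keys are assumptions, and sets are used in place of the lists mentioned above to keep membership updates simple.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set


class CompactMPG:
    """Multi-pattern graph stored as two hash tables instead of vertices and edges."""

    def __init__(self) -> None:
        # Table 1: maximal common subpattern MP -> set of patterns sharing it.
        self.patterns_by_mp: Dict[Hashable, Set[str]] = defaultdict(set)
        # Table 2: pattern -> maximal common subpatterns it shares with its peers.
        self.mps_by_pattern: Dict[str, Set[Hashable]] = defaultdict(set)

    def add_sharing(self, mp: Hashable, pattern_ids: Iterable[str]) -> None:
        """Record that all given patterns share the subpattern mp."""
        for pid in pattern_ids:
            self.patterns_by_mp[mp].add(pid)
            self.mps_by_pattern[pid].add(mp)

    def neighbors(self, pid: str) -> Set[str]:
        """Patterns adjacent to pid in the (implicit) MPG."""
        result: Set[str] = set()
        for mp in self.mps_by_pattern.get(pid, ()):
            result |= self.patterns_by_mp[mp]
        result.discard(pid)
        return result

    def remove_pattern(self, pid: str) -> None:
        """Delete a pattern and drop subpatterns that are no longer shared."""
        for mp in self.mps_by_pattern.pop(pid, set()):
            peers = self.patterns_by_mp[mp]
            peers.discard(pid)
            if len(peers) < 2:  # a subpattern held by one pattern is not shared
                for other in peers:
                    self.mps_by_pattern[other].discard(mp)
                del self.patterns_by_mp[mp]
```

When m patterns share one subpattern, this layout keeps a single entry with m members rather than one edge per pair of patterns.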

Another potential performance bottleneck associated with the MPG is the resource-consuming operation of calculating the maximal common subpatterns for all pairs of patterns. Accordingly, the following simple and efficient implementation will be utilized. Given P_(i)=( _(i), S_(i), C_(i), W_(i)) and P_(j)=( _(j), S_(j), C_(j), W_(j)), first a simple set intersection _(ij) of _(i) and _(j) is calculated. Then, the conditions in C_(i) and C_(j) are projected on S_(ij), and the resulting condition sets are compared. If the sets are not equal, their intersection is calculated and _(ij) is reduced accordingly. The same procedure is then performed for S_(i) and S_(j). Overall, the worst-case complexity of this operation is O(max(| _(i)|, | _(j)|)+max(|C_(i)|, |C_(j)|)).

Note that multiple maximal common subpatterns may exist. For example, both SEQ(A,B) and SEQ(A,C) are maximal intersections of the sequences SEQ(A,B,C) and SEQ(A,C,B). In this case, the MPG will store a list of maximal common subpatterns.

The worst-case complexity of computing all maximal common subpatterns is then O(n²·(s_(max)+c_(max))), where s_(max) and c_(max) denote the maximum sizes of a pattern in terms of events and conditions, respectively.

Local Search Meta-Heuristics

In some embodiments, two local search meta-heuristics, simulated annealing and tabu search, are used.

Simulated annealing extends the functionality of iterative improvement by also allowing limited non-improving moves. A threshold c_(k) is defined for each step. When a better neighbor solution is selected, it replaces the current solution, in a manner similar to the iterative improvement algorithm. If the neighbor solution is more expensive, it is accepted with probability $\exp\left(-\frac{\Delta f}{c_{k}}\right)$, where Δf is the difference between the costs of the old and the new solutions. The thresholds are chosen such that c_(k)=α·c_(k-1), α<1. The algorithm starts with a sufficiently large c₀ and terminates when a predefined small value c_(k) is reached. Before the start of the actual search, c₀ is set to the largest difference observed during evaluation of I neighbors of s_(init). In the experiments detailed above, α=0.99 and I=10³ neighbors were used for setting the initial threshold.
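The simulated annealing procedure described above can be sketched generically as follows. This is a hedged illustration and not the optimizer itself: the solution type, the `cost` and `random_neighbor` callables, and the final threshold `c_min` are assumptions supplied by the caller, Δf is taken as the new cost minus the current cost so that the acceptance probability is below 1 for worse neighbors, and the sketch stops on the threshold rather than on a time limit.

```python
import math
import random
from typing import Callable, Optional, Tuple, TypeVar

S = TypeVar("S")


def simulated_annealing(s_init: S,
                        cost: Callable[[S], float],
                        random_neighbor: Callable[[S, random.Random], S],
                        alpha: float = 0.99,
                        init_samples: int = 1000,
                        c_min: float = 1e-3,
                        rng: Optional[random.Random] = None) -> Tuple[S, float]:
    rng = rng or random.Random()
    base = cost(s_init)
    # c_0: largest cost difference observed over I sampled neighbors of s_init.
    c = max(abs(cost(random_neighbor(s_init, rng)) - base) for _ in range(init_samples))
    current, current_cost = s_init, base
    best, best_cost = current, current_cost
    while c > c_min:  # assumption: c_min is the predefined small final threshold
        candidate = random_neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Improving moves are always taken; worse ones with probability exp(-delta / c).
        if delta <= 0 or rng.random() < math.exp(-delta / c):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        c *= alpha  # c_k = alpha * c_(k-1)
    return best, best_cost
```

In the setting of this document, a solution would be a multi-pattern evaluation plan and `random_neighbor` would apply one step of the chosen neighborhood function; here both are left abstract.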

Tabu search explores L random neighbors during each step and moves to the cheapest of them. Visiting the same state twice is prohibited. To enforce that, previously visited solutions are stored in a dedicated tabu list. The tabu list has a finite capacity C: when the number of stored solutions reaches C, the oldest stored solutions are removed. The best solution s* observed during the run of the algorithm is returned. A memory list of capacity C=10⁴ and L=100 was used during this experimental evaluation.

Both algorithms stop after reaching a predefined number of steps since the last improvement to s* or when the time expires. To study the tradeoff between evaluation time and solution quality, only the timestamp-based stop condition was implemented.
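A generic sketch of the tabu search loop is shown below. It is illustrative only: solutions are assumed to be hashable, the `cost` and `random_neighbor` callables and the fixed step budget `steps` are hypothetical parameters, and the search simply stops if every sampled neighbor is already in the tabu list.

```python
import random
from collections import deque
from typing import Callable, Deque, Hashable, Optional, Set, Tuple, TypeVar

S = TypeVar("S", bound=Hashable)


def tabu_search(s_init: S,
                cost: Callable[[S], float],
                random_neighbor: Callable[[S, random.Random], S],
                steps: int,
                L: int = 100,
                C: int = 10_000,
                rng: Optional[random.Random] = None) -> Tuple[S, float]:
    rng = rng or random.Random()
    current = s_init
    best, best_cost = s_init, cost(s_init)
    tabu_order: Deque[S] = deque([s_init])  # insertion order, for evicting the oldest
    tabu_set: Set[S] = {s_init}             # same content, for fast membership tests
    for _ in range(steps):
        # Sample L random neighbors and discard those already visited.
        candidates = {random_neighbor(current, rng) for _ in range(L)} - tabu_set
        if not candidates:
            break  # assumption: stop when all sampled neighbors are tabu
        current = min(candidates, key=cost)  # move to the cheapest, even if worse
        current_cost = cost(current)
        if current_cost < best_cost:
            best, best_cost = current, current_cost
        tabu_order.append(current)
        tabu_set.add(current)
        if len(tabu_order) > C:  # finite capacity: forget the oldest solution
            tabu_set.discard(tabu_order.popleft())
    return best, best_cost
```

Because the move to the cheapest non-tabu neighbor is taken even when it is worse than the current solution, the search can leave a local minimum, while the returned s* always remains the best solution seen.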

Formal Definition of M-MCEP

The cost function and the optimization problem of multitree-based MCEP are formally defined below.

First, the cost function is extended. Let T_(i) denote a local tree-based evaluation plan for a pattern P_(i). The cost function definition for tree-based plans is borrowed from (Kolchinsky [2018]). For a plan T_(i), the cost is defined as

$\mathrm{Cost}_{tree}(T_{i}) = \sum_{N \in nodes(T_{i})} C(N),$

where

$C(N) = \begin{cases} W_{i} \cdot r_{j} & \text{if } N \text{ is a leaf representing } E_{j}, \\ C(L) \cdot C(R) \cdot sel_{L,R} & \text{if } N \text{ is an internal node with child nodes } L \text{ and } R. \end{cases}$

Here, sel_(L,R) denotes the total selectivity of all conditions defined between the event types in L and R.

The extension of Cost_(tree) for multitrees is defined by summing the individual costs of all nodes in a multitree:

$\mathrm{Cost}_{tree}^{multi}(MPM) = \sum_{N \in nodes(MPM)} C(N).$
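The two cost definitions can be illustrated with a short sketch. The assumptions are stated plainly: a multitree is represented by the list of its per-pattern root nodes, structurally identical subtrees are treated as the same shared node, a single window W is used for all patterns, and selectivity is supplied by a caller-provided function. The names `Node`, `tree_cost` and `multitree_cost` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, Optional, Set


@dataclass(frozen=True)
class Node:
    """Leaf if event_type is set; otherwise an internal node with two children."""
    event_type: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


Selectivity = Callable[[Node, Node], float]


def node_cost(n: Node, rates: Dict[str, float], window: float, sel: Selectivity) -> float:
    if n.event_type is not None:
        return window * rates[n.event_type]            # leaf: W * r_j
    return (node_cost(n.left, rates, window, sel)      # internal: C(L) * C(R) * sel_{L,R}
            * node_cost(n.right, rates, window, sel)
            * sel(n.left, n.right))


def iter_nodes(n: Node) -> Iterator[Node]:
    yield n
    if n.event_type is None:
        yield from iter_nodes(n.left)
        yield from iter_nodes(n.right)


def tree_cost(root: Node, rates: Dict[str, float], window: float, sel: Selectivity) -> float:
    """Sum of C(N) over all nodes of a single tree-based plan."""
    return sum(node_cost(n, rates, window, sel) for n in iter_nodes(root))


def multitree_cost(roots: Iterable[Node], rates: Dict[str, float],
                   window: float, sel: Selectivity) -> float:
    """Sum of C(N) over the distinct nodes of a multitree; shared subtrees count once."""
    distinct: Set[Node] = set()
    for root in roots:
        distinct.update(iter_nodes(root))  # frozen dataclasses hash structurally
    return sum(node_cost(n, rates, window, sel) for n in distinct)
```

For two plans ((A,B),C) and ((A,B),D) that share the subtree (A,B), the multitree cost is lower than the sum of the two individual tree costs by exactly the cost of the shared nodes, which is the saving that sharing is meant to capture.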

Given a tree-based plan T and a multi-pattern multitree MPM, it is said that T∈MPM if and only if MPM contains a subtree identical to T. A subtree of the MPM corresponding to a pattern P_(i) will be denoted as . In addition, TREE_(P) will denote the set of all tree-based plans of a pattern P. The extended optimization problem will be subsequently defined as follows:

Multitree-based multi-pattern CEP optimization problem (M-MCEP). Given a workload WL of n patterns and a statistics collection Stat, find a multi-pattern multitree MPM minimizing the value of Cost_(tree)^(multi)(MPM, WL, Stat) subject to ∀P_(i), 1≤i≤n: ∃T∈TREE_(P_(i)) s.t. T∈MPM.

Since T-MCEP can be viewed as a particular case of M-MCEP (restricted to left-deep trees as local plans), the complexity results obtained for T-MCEP hold for M-MCEP by generalization.

To justify the use of and for MPM-based solution space, an observation similar to the one presented in Theorem 1 is utilized.

Theorem 2. Let MPM_(opt) be the optimal multi-pattern multitree for some workload W. For each tree in MPM_(opt) corresponding to the pattern P_(i), let denote the set of subtrees that are shared with other patterns in MPM_(opt). Then, is the most efficient local tree-based plan for P_(i) out of those containing all the subtrees in .

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a hardware processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the Figs. illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The descriptions of the various embodiments of the present invention have been presented for purposes of illustration but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

In the description and claims of the application, each of the words “comprise”, “include” and “have”, and forms thereof, are not necessarily limited to members in a list with which the words may be associated. In addition, where there are inconsistencies between this application and any document incorporated by reference, it is hereby intended that the present application controls.

What is claimed is:
 1. A system comprising: at least one hardware processor; and a non-transitory computer-readable storage medium having stored thereon program instructions, the program instructions executable by the at least one hardware processor to: receive, as input, a data stream representing events; receive, as input, a plurality of complex event patterns (CEPs), each representing an occurrence of a respective CEP in said data stream, wherein each of said CEPs comprises (a) a set of conditions reflecting relations among said events, and (b) a set of attributes associated with each of said events; and calculate an optimal multi-pattern evaluation plan corresponding to said plurality of CEPs, wherein said multi-pattern evaluation plan is created by: (i) generating an initial evaluation plan, (ii) reordering events in at least one CEP, to calculate modified versions of said initial evaluation plan, (iii) assigning a score to each of said modified versions based on a cost function, and (iv) selecting one of said modified versions having a highest said score as said optimal multi-pattern evaluation plan.
 2. The system of claim 1, wherein reordering of said events in each of said CEPs comprises maximizing common sub-patterns among said CEPs, and wherein the at least one processor is further configured to share said common sub-patterns among all of said CEPs.
 3. The system of claim 1, wherein said cost function minimizes a number of estimated intermediate results during an execution of said modified version.
 4. The system of claim 1, wherein steps (ii) and (iii) are repeated iteratively based on one of: a specified time limit, and a specified number of iterations.
 5. The system of claim 1, wherein said CEPs are based on user definition.
 6. The system of claim 1, wherein said program instructions are further executable to execute said multi-pattern evaluation plan on said data stream, to generate output data.
 7. A method comprising: receiving, as input, a data stream representing events; receiving, as input, a plurality of complex event patterns (CEPs), each representing an occurrence of a respective CEP in said data stream, wherein each of said CEPs comprises (a) a set of conditions reflecting relations among said events, and (b) a set of attributes associated with each of said events; and calculating an optimal multi-pattern evaluation plan corresponding to said plurality of CEPs, wherein said multi-pattern evaluation plan is created by: (i) generating an initial evaluation plan, (ii) reordering events in at least one CEP, to calculate modified versions of said initial evaluation plan, (iii) assigning a score to each of said modified versions based on a cost function, and (iv) selecting one of said modified versions having a highest said score as said optimal multi-pattern evaluation plan.
 8. The method of claim 7, wherein reordering of said events in each of said CEPs comprises maximizing common sub-patterns among said CEPs, and wherein the at least one processor is further configured to share said common sub-patterns among all of said CEPs.
 9. The method of claim 7, wherein said cost function minimizes a number of estimated intermediate results during an execution of said modified version.
 10. The method of claim 7, wherein steps (ii) and (iii) are repeated iteratively based on one of: a specified time limit, and a specified number of iterations.
 11. The method of claim 7, wherein said CEPs are based on user definition.
 12. The method of claim 7, further comprising executing said multi-pattern evaluation plan on said data stream, to generate output data.
 13. A computer program product comprising a non-transitory computer-readable storage medium having program instructions embodied therewith, the program instructions executable by at least one hardware processor to: receive, as input, a data stream representing events; receive, as input, a plurality of complex event patterns (CEPs), each representing an occurrence of a respective CEP in said data stream, wherein each of said CEPs comprises (a) a set of conditions reflecting relations among said events, and (b) a set of attributes associated with each of said events; and calculate an optimal multi-pattern evaluation plan corresponding to said plurality of CEPs, wherein said multi-pattern evaluation plan is created by: (i) generating an initial evaluation plan, (ii) reordering events in at least one CEP, to calculate modified versions of said initial evaluation plan, (iii) assigning a score to each of said modified versions based on a cost function, and (iv) selecting one of said modified versions having a highest said score as said optimal multi-pattern evaluation plan.
 14. The computer program product of claim 13, wherein reordering of said events in each of said CEPs comprises maximizing common sub-patterns among said CEPs, and wherein the at least one processor is further configured to share said common sub-patterns among all of said CEPs.
 15. The computer program product of claim 13, wherein said cost function minimizes a number of estimated intermediate results during an execution of said modified version.
 16. The computer program product of claim 13, wherein steps (ii) and (iii) are repeated iteratively based on one of: a specified time limit, and a specified number of iterations.
 17. The computer program product of claim 13, wherein said CEPs are based on user definition.
 18. The computer program product of claim 13, wherein said program instructions are further executable to execute said multi-pattern evaluation plan on said data stream, to generate output data.